19 research outputs found

    Robustness and Interpretability of Neural Networks’ Predictions under Adversarial Attacks

    Deep Neural Networks (DNNs) are powerful predictive models, exceeding human capabilities in a variety of tasks. They learn complex and flexible decision systems from the available data and achieve exceptional performance in multiple machine learning fields, spanning from applications in artificial intelligence, such as image, speech and text recognition, to the more traditional sciences, including medicine, physics and biology. Despite the outstanding achievements, high performance and high predictive accuracy are not sufficient for real-world applications, especially in safety-critical settings, where the usage of DNNs is severely limited by their black-box nature. There is an increasing need to understand how predictions are performed, to provide uncertainty estimates, to guarantee robustness to malicious attacks and to prevent unwanted behaviours. State-of-the-art DNNs are vulnerable to small perturbations in the input data, known as adversarial attacks: maliciously crafted manipulations of the inputs that are perceptually indistinguishable from the original samples but are capable of fooling the model into incorrect predictions. In this work, we prove that such brittleness is related to the geometry of the data manifold and is therefore likely to be an intrinsic feature of DNNs’ predictions. This negative condition suggests a possible direction to overcome this limitation: we study the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian Neural Networks and prove that, in this limit, they are immune to gradient-based adversarial attacks. Furthermore, we propose some training techniques to improve the adversarial robustness of deterministic architectures. In particular, we experimentally observe that ensembles of NNs trained on random projections of the original inputs into lower-dimensional spaces are more resilient to the attacks. Next, we focus on the problem of interpretability of NNs’ predictions in the setting of saliency-based explanations. We analyze the stability of the explanations under adversarial attacks on the inputs and we prove that, in the large-data and overparameterized limit, Bayesian interpretations are more stable than those provided by deterministic networks. We validate this behaviour in multiple experimental settings in the finite data regime. Finally, we introduce the concept of adversarial perturbations of amino acid sequences for protein Language Models (LMs). Deep Learning models for protein structure prediction, such as AlphaFold2, leverage Transformer architectures and their attention mechanism to capture structural and functional properties of amino acid sequences. Despite the high accuracy of predictions, biologically small perturbations of the input sequences, or even single point mutations, can lead to substantially different 3D structures. On the other hand, protein language models are insensitive to mutations that induce misfolding or dysfunction (e.g. missense mutations). Specifically, predictions of the 3D coordinates do not reveal the structure-disruptive effect of these mutations. Therefore, there is an evident inconsistency between the biological importance of mutations and the resulting change in structural prediction. Inspired by this problem, we introduce the concept of adversarial perturbation of protein sequences in continuous embedding spaces of protein language models. Our method relies on attention scores to detect the most vulnerable amino acid positions in the input sequences. Adversarial mutations are biologically distinct from their references and are able to significantly alter the resulting 3D structures.

    Conditioning Score-Based Generative Models by Neuro-Symbolic Constraints

    Score-based and diffusion models have emerged as effective approaches for both conditional and unconditional generation. Still, conditional generation relies on either dedicated training of a conditional model or classifier guidance, which requires training a noise-dependent classifier even when a classifier for uncorrupted data is available. We propose an approach to sample from unconditional score-based generative models while enforcing arbitrary logical constraints, without any additional training. First, we show how to manipulate the learned score in order to sample from an un-normalized distribution conditional on a user-defined constraint. Then, we define a flexible and numerically stable neuro-symbolic framework for encoding soft logical constraints. Combining these two ingredients, we obtain a general, but approximate, conditional sampling algorithm. We further develop effective heuristics aimed at improving the approximation. Finally, we show the effectiveness of our approach for various types of constraints and data: tabular data, images and time series.
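The idea of manipulating a score to enforce a soft constraint can be illustrated with a toy example. The sketch below is not the authors' algorithm: it assumes a one-dimensional standard normal whose score is known in closed form (standing in for a learned score network), a hypothetical soft constraint "x > 1" encoded as a sigmoid, and plain unadjusted Langevin dynamics at a single noise level. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

def base_score(x):
    # Score of a standard normal: d/dx log N(x; 0, 1) = -x.
    # Assumption: stands in for a learned unconditional score network.
    return -x

def constraint_log_grad(x, threshold=1.0, sharpness=5.0):
    # Hypothetical soft constraint "x > threshold", c(x) = sigmoid(k * (x - t)).
    # d/dx log c(x) = k * sigmoid(-k * (x - t)) = k / (1 + exp(k * (x - t))).
    z = np.clip(sharpness * (x - threshold), -50.0, 50.0)
    return sharpness / (1.0 + np.exp(z))

def conditional_score(x):
    # Score of the un-normalized density p(x) * c(x):
    # grad log p(x) + grad log c(x). No retraining is involved.
    return base_score(x) + constraint_log_grad(x)

def langevin_sample(score_fn, n_samples=2000, n_steps=500, step=0.01, seed=0):
    # Unadjusted Langevin dynamics: x <- x + step * score(x) + sqrt(2 * step) * noise.
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_samples)
    for _ in range(n_steps):
        noise = rng.normal(size=n_samples)
        x = x + step * score_fn(x) + np.sqrt(2.0 * step) * noise
    return x

unconditional = langevin_sample(base_score)
conditional = langevin_sample(conditional_score)
print("unconditional mean:", unconditional.mean())
print("conditional mean:  ", conditional.mean())
print("fraction above threshold:", (conditional > 1.0).mean())
```

Because the constraint is soft, some samples still fall below the threshold; raising `sharpness` tightens the constraint at the cost of a stiffer gradient, which is one reason such schemes are approximate.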

    Adversarial Learning of Robust and Safe Controllers for Cyber-Physical Systems

    We introduce a novel learning-based approach to synthesize safe and robust controllers for autonomous Cyber-Physical Systems and, at the same time, to generate challenging tests. This procedure combines formal methods for model verification with Generative Adversarial Networks. The method learns two Neural Networks: the first aims at generating troubling scenarios for the controller, while the second aims at enforcing the safety constraints. We test the proposed method on a variety of case studies.

    On the Robustness of Bayesian Neural Networks to Adversarial Attacks

    Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit BNN posteriors are robust to gradient-based adversarial attacks. Crucially, we prove that the expected gradient of the loss with respect to the BNN posterior distribution is vanishing, even when each neural network sampled from the posterior is vulnerable to gradient-based attacks. Experimental results on the MNIST, Fashion MNIST, and half moons datasets, representing the finite data regime, with BNNs trained with Hamiltonian Monte Carlo and Variational Inference, support this line of argument, showing that BNNs can display both high accuracy on clean data and robustness to both gradient-based and gradient-free adversarial attacks. Comment: arXiv admin note: text overlap with arXiv:2002.0435
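The central claim can be written compactly. The notation below is an assumption introduced for illustration, not taken from the paper: $L(x, w)$ is the loss at input $x$ for weights $w$, $p(w \mid D)$ is the posterior given data $D$, and the attack shown is the standard fast gradient sign method with budget $\epsilon$, applied to the posterior-averaged loss.

```latex
% Vanishing expected gradient in the large-data, overparameterized limit
\mathbb{E}_{w \sim p(w \mid D)}\!\left[ \nabla_x L(x, w) \right] = 0

% A gradient-sign attack on the posterior-averaged loss then has no direction to follow:
\tilde{x} = x + \epsilon \,\operatorname{sign}\!\Big( \mathbb{E}_{w \sim p(w \mid D)}\!\left[ \nabla_x L(x, w) \right] \Big) = x
```

Individual samples $w$ may still have $\nabla_x L(x, w) \neq 0$; the statement concerns only the average over the posterior.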

    Prescription appropriateness of anti-diabetes drugs in elderly patients hospitalized in a clinical setting: evidence from the REPOSI Register

    Diabetes is an increasing global health burden, with the highest prevalence (24.0%) observed in elderly people. Older diabetic adults have a greater risk of hospitalization and of several geriatric syndromes than older nondiabetic adults. For these conditions, special care is required in prescribing therapies, including anti-diabetes drugs. The aim of this study was to evaluate the appropriateness of, and the adherence to safety recommendations in, the prescription of glucose-lowering drugs in hospitalized elderly patients with diabetes. Data for this cross-sectional study were obtained from the REgistro POliterapie-Società Italiana Medicina Interna (REPOSI), which collected clinical information on patients aged ≥ 65 years acutely admitted to Italian internal medicine and geriatric non-intensive care units (ICU) from 2010 up to 2019. Prescription appropriateness was assessed according to the 2019 AGS Beers Criteria and anti-diabetes drug data sheets. Among 5349 patients, 1624 (30.3%) had a diagnosis of type 2 diabetes. At admission, 37.7% of diabetic patients received treatment with metformin, 37.3% insulin therapy, 16.4% sulfonylureas, and 11.4% glinides. Surprisingly, only 3.1% of diabetic patients were treated with new classes of anti-diabetes drugs. According to prescription criteria, at admission 15.4% of patients treated with metformin and 2.6% of those treated with sulfonylureas received these treatments inappropriately. At discharge, the inappropriateness of metformin therapy decreased (10.2%, P < 0.0001). According to the Beers criteria, the inappropriate prescriptions of sulfonylureas rose to 29% both at admission and at discharge. This study shows poor adherence to current guidelines on diabetes management in hospitalized elderly people, with a high prevalence of inappropriate use of sulfonylureas according to the Beers criteria.

    Clinical features and outcomes of elderly hospitalised patients with chronic obstructive pulmonary disease, heart failure or both

    Background and objective: Chronic obstructive pulmonary disease (COPD) and heart failure (HF) mutually increase the risk of being present in the same patient, especially if older. Whether or not this coexistence may be associated with a worse prognosis is debated. Therefore, employing data derived from the REPOSI register, we evaluated the clinical features and outcomes in a population of elderly patients admitted to internal medicine wards and having COPD, HF or COPD + HF. Methods: We measured socio-demographic and anthropometric characteristics, severity and prevalence of comorbidities, clinical and laboratory features during hospitalization, mood disorders, functional independence, drug prescriptions and discharge destination. The primary study outcome was the risk of death. Results: We considered 2,343 elderly hospitalized patients (median age 81 years), of whom 1,154 (49%) had COPD, 813 (35%) HF, and 376 (16%) COPD + HF. Patients with COPD + HF had different characteristics than those with COPD or HF, such as a higher prevalence of previous hospitalizations and comorbidities (especially chronic kidney disease), a higher respiratory rate at admission and a higher number of prescribed drugs. Patients with COPD + HF (hazard ratio (HR) 1.74, 95% confidence interval (CI) 1.16-2.61) and patients with dementia (HR 1.75, 95% CI 1.06-2.90) had a higher risk of death at one year. The Kaplan-Meier curves showed a higher mortality risk in the group of patients with COPD + HF for all causes (p = 0.010), respiratory causes (p = 0.006), cardiovascular causes (p = 0.046) and respiratory plus cardiovascular causes (p = 0.009). Conclusion: In this real-life cohort of hospitalized elderly patients, the coexistence of COPD and HF significantly worsened prognosis at one year. This finding may help to better define the care needs of this population.

    Role of prenatal magnetic resonance imaging in fetuses with isolated mild or moderate ventriculomegaly in the era of neurosonography: international multicenter study

    Objectives: To assess the role of fetal magnetic resonance imaging (MRI) in detecting associated anomalies in fetuses presenting with mild or moderate isolated ventriculomegaly (VM) undergoing multiplanar ultrasound evaluation of the fetal brain. Methods: This was a multicenter, retrospective cohort study involving 15 referral fetal medicine centers in Italy, the UK and Spain. Inclusion criteria were fetuses affected by isolated mild (ventricular atrial diameter, 10.0–11.9 mm) or moderate (ventricular atrial diameter, 12.0–14.9 mm) VM on ultrasound, defined as VM with normal karyotype and no other additional central nervous system (CNS) or extra-CNS anomalies on ultrasound, undergoing detailed assessment of the fetal brain using a multiplanar approach as suggested by the International Society of Ultrasound in Obstetrics and Gynecology guidelines for the fetal neurosonogram, followed by fetal MRI. The primary outcome of the study was the incidence of additional CNS anomalies detected exclusively on prenatal MRI and missed on ultrasound, while the secondary aim was to estimate the incidence of additional anomalies detected exclusively after birth and missed on prenatal imaging (ultrasound and MRI). Subgroup analyses according to gestational age at MRI (< 24 vs ≥ 24 weeks), laterality of VM (unilateral vs bilateral) and severity of dilatation (mild vs moderate VM) were also performed. Results: Five hundred and fifty-six fetuses with a prenatal diagnosis of isolated mild or moderate VM on ultrasound were included in the analysis. Additional structural anomalies were detected on prenatal MRI and missed on ultrasound in 5.4% (95% CI, 3.8–7.6%) of cases. When considering the type of anomaly, supratentorial intracranial hemorrhage was detected on MRI in 26.7% of fetuses, while polymicrogyria and lissencephaly were detected in 20.0% and 13.3% of cases, respectively. Hypoplasia of the corpus callosum was detected on MRI in 6.7% of cases, while dysgenesis was detected in 3.3%. Fetuses with an associated anomaly detected only on MRI were more likely to have moderate than mild VM (60.0% vs 17.7%; P < 0.001), while there was no significant difference in the proportion of cases with bilateral VM between the two groups (P = 0.2). Logistic regression analysis showed that lower maternal body mass index (adjusted odds ratio (aOR), 0.85 (95% CI, 0.7–0.99); P = 0.030), the presence of moderate VM (aOR, 5.8 (95% CI, 2.6–13.4); P < 0.001) and gestational age at MRI ≥ 24 weeks (aOR, 4.1 (95% CI, 1.1–15.3); P = 0.038) were associated independently with the probability of detecting an associated anomaly on MRI. Associated anomalies were detected exclusively at birth and missed on prenatal imaging in 3.8% of cases. Conclusions: The incidence of an associated fetal anomaly missed on ultrasound and detected only on fetal MRI in fetuses with isolated mild or moderate VM undergoing neurosonography is lower than that reported previously. The large majority of these anomalies are difficult to detect on ultrasound. The findings from this study support the practice of MRI assessment in every fetus with a prenatal diagnosis of VM, although parents can be reassured of the low risk of an associated anomaly when VM is isolated on neurosonography.

    Random Projections for Improved Adversarial Robustness

    We propose two training techniques for improving the robustness of Neural Networks to adversarial attacks, i.e., manipulations of the inputs that are maliciously crafted to fool networks into incorrect predictions. Both methods are independent of the chosen attack and leverage random projections of the original inputs, with the purpose of exploiting both dimensionality reduction and some characteristic geometrical properties of adversarial perturbations. The first technique, called RP-Ensemble, consists of an ensemble of networks trained on multiple projected versions of the original inputs. The second, named RP-Regularizer, instead adds a regularization term to the training objective.
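The ensemble idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: each ensemble member is a nearest-centroid classifier (swapped in for a neural network to keep the example short), projections are Gaussian random matrices, and member outputs are combined by averaging class probabilities. The class name `RPEnsemble`, the helper functions and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    # Nearest-centroid "model": one mean vector per class.
    # Assumption: a simple stand-in for a trained neural network.
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def centroid_probs(X, centroids):
    # Softmax over negative squared distances to each class centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

class RPEnsemble:
    def __init__(self, n_members=10, proj_dim=5, seed=0):
        self.n_members = n_members
        self.proj_dim = proj_dim
        self.seed = seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        n_classes = int(y.max()) + 1
        # One independent Gaussian projection matrix per ensemble member.
        self.projections = [
            rng.normal(size=(X.shape[1], self.proj_dim)) / np.sqrt(self.proj_dim)
            for _ in range(self.n_members)
        ]
        # Each member is trained only on its own projected view of the inputs.
        self.members = [
            fit_centroids(X @ P, y, n_classes) for P in self.projections
        ]
        return self

    def predict_proba(self, X):
        # Average the class probabilities of all members.
        probs = [centroid_probs(X @ P, C)
                 for P, C in zip(self.projections, self.members)]
        return np.mean(probs, axis=0)

    def predict(self, X):
        return self.predict_proba(X).argmax(axis=1)

def make_blobs(n_per_class=100, dim=20, seed=1):
    # Two well-separated Gaussian blobs as synthetic two-class data.
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(loc=-2.0, size=(n_per_class, dim)),
                   rng.normal(loc=2.0, size=(n_per_class, dim))])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X, y = make_blobs()
model = RPEnsemble().fit(X, y)
accuracy = (model.predict(X) == y).mean()
print(f"training accuracy: {accuracy:.3f}")
```

The sketch only shows the training and aggregation structure; it makes no claim about adversarial robustness, which would need to be measured against an actual attack.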